Intention Recognition
MIRAGE: Multimodal Intention Recognition and Admittance-Guided Enhancement in VR-based Multi-object Teleoperation
Sun, Chi, Wang, Xian, Kumar, Abhishek, Cui, Chengbin, Lee, Lik-Hang
This is the author's version of the article. To appear in an IEEE ISMAR conference. The Hong Kong Polytechnic University.

Figure 1: A pictorial description of the MIRAGE framework, which enhances HRI tele-grasping capability for multiple objects in VR. MIRAGE divides the multi-object grasping task into two phases: movement (manual) and grasping (semi-automatic), each with a dedicated assistance method. In the movement (manual) phase, Virtual Admittance (VA) modifies the robot trajectory (b); compared to the non-VA condition (a), the same hand movement drives the robot toward the target more easily. In the grasping (semi-automatic) phase, a Multimodal-CNN-based Human Intention Perception Network (MMIPN) estimates the human's desired grasp position for the robot's grasp motion plan (d), whereas the non-MMIPN condition plans the grasping motion as a vertical downward path (c).

Effective human-robot interaction (HRI) in multi-object teleoperation tasks faces significant challenges due to perceptual ambiguities in virtual reality (VR) environments and the limitations of single-modality intention recognition. This paper proposes a shared control framework that combines a virtual admittance (VA) model with a Multimodal-CNN-based Human Intention Perception Network (MMIPN) to enhance teleoperation performance and user experience. The VA model employs artificial potential fields to guide operators toward target objects by adjusting the admittance force and optimizing motion trajectories. MMIPN processes multimodal inputs (gaze movement, robot motions, and environmental context) to estimate human grasping intentions, helping overcome depth perception challenges in VR. Gaze data emerged as the most crucial input modality. These findings demonstrate the effectiveness of combining multimodal cues with implicit guidance in VR-based teleoperation, providing a robust solution for multi-object grasping tasks and enabling more natural interactions across various applications in the future.

With the rapid development of robotics and, in particular, metaverse technology, teleoperation has gained diverse modes and expanded opportunities for remote operations. In aerospace manipulator operation [28, 45], extraterrestrial ground exploration [8], nuclear environment maintenance [46, 15], remote medical surgery [62, 12], and life care assistance [44], teleoperation already has a wide range of technical needs and successful application experience. The rise of metaverse technology has further promoted the application of virtual reality (VR) in industrial teleoperation [67, 9, 48], whose immersion can provide a more realistic experience for the operator.
- Asia > China > Hong Kong (0.60)
- North America > United States > Missouri > Jackson County > Kansas City (0.14)
- Europe > Spain (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
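To make the virtual admittance idea concrete, below is a minimal sketch of an attractive potential field feeding an admittance law of the form M·a + D·v = F_human + F_guidance. The point-mass model, gains, and purely attractive field are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

def guidance_force(pos, target, k_att=2.0, max_force=5.0):
    """Attractive potential-field force pulling the end effector toward the target."""
    f = k_att * (target - pos)
    n = np.linalg.norm(f)
    return f if n <= max_force else f * (max_force / n)

def admittance_step(pos, vel, f_human, target, dt=0.01, mass=1.0, damping=4.0):
    """One Euler step of the admittance law M*a + D*v = F_human + F_guidance."""
    f_total = f_human + guidance_force(pos, target)
    acc = (f_total - damping * vel) / mass
    vel = vel + acc * dt
    return pos + vel * dt, vel

# A constant, slightly off-axis operator push: the guidance field keeps the
# tool near the target instead of letting it drift along the push direction.
pos, vel = np.zeros(3), np.zeros(3)
target = np.array([0.5, 0.2, 0.3])
for _ in range(2000):
    pos, vel = admittance_step(pos, vel, np.array([0.5, 0.0, 0.0]), target)
print(np.round(pos, 3))  # settles near the target, offset by f_human / k_att
```

A repulsive term for obstacles could be added the same way; the force clipping keeps the guidance from overpowering the operator's own input.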
Uncertainty-Resilient Active Intention Recognition for Robotic Assistants
Saborío, Juan Carlos, Vinci, Marc, Lima, Oscar, Stock, Sebastian, Niecksch, Lennart, Günther, Martin, Sung, Alexander, Hertzberg, Joachim, Atzmüller, Martin
Purposeful behavior in robotic assistants requires the integration of multiple components and technological advances. Often, the problem is reduced to recognizing explicit prompts, which limits autonomy, or is oversimplified through assumptions such as near-perfect information. We argue that a critical gap remains unaddressed: the challenge of reasoning about the uncertain outcomes and perception errors inherent to human intention recognition. In response, we present a framework designed to be resilient to uncertainty and sensor noise, integrating real-time sensor data with a combination of planners. Our integrated framework has been successfully tested on a physical robot with promising results. Robotic assistants may be integrated into modern industrial environments, e.g., delivering tools, parts, or modules interleaved with tidying the workspace. Such tasks, however, require a combination of robust planning, navigation, grasping, and perception, particularly when explicit commands are not available and the robot must identify and pursue goals in collaborative spaces shared with people.
- Europe > Germany > Lower Saxony (0.14)
- Europe > Czechia > Prague (0.04)
- Asia > Middle East > Israel (0.04)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.74)
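The noisy-perception problem this framework targets rests on a standard building block: a Bayes filter over candidate intentions. The intents, observation symbols, and likelihoods below are invented for illustration; the actual framework integrates this kind of update with its planners:

```python
import numpy as np

def update_belief(belief, obs, obs_model):
    """One Bayes-filter step: P(intent | obs) ∝ P(obs | intent) · P(intent)."""
    posterior = belief * obs_model[:, obs]
    s = posterior.sum()
    return belief if s == 0 else posterior / s  # keep prior if obs is impossible

intents = ["fetch_tool", "tidy_workspace", "idle"]
# Rows = intents, columns = observation symbols. The sensor is noisy: each
# intent mostly emits "its own" symbol but leaks probability to the others.
obs_model = np.array([[0.70, 0.20, 0.10],
                      [0.20, 0.70, 0.10],
                      [0.15, 0.15, 0.70]])
belief = np.full(3, 1 / 3)
for obs in [0, 0, 1, 0]:  # a noisy observation stream favouring fetch_tool
    belief = update_belief(belief, obs, obs_model)
print(dict(zip(intents, np.round(belief, 3))))
```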
Last Layer Hamiltonian Monte Carlo
Vellenga, Koen, Steinhauer, H. Joe, Falkman, Göran, Andersson, Jonas, Sjögren, Anders
We explore the use of Hamiltonian Monte Carlo (HMC) sampling as a probabilistic last-layer approach for deep neural networks (DNNs). While HMC is widely regarded as a gold standard for uncertainty estimation, its computational demands limit its application to large-scale datasets and large DNN architectures. Although the predictions from the sampled DNN parameters can be parallelized, the computational cost still scales linearly with the number of samples (similar to an ensemble). Last-layer HMC (LL-HMC) reduces the required computations by restricting the HMC sampling to the final layer of a DNN, making it applicable to more data-intensive scenarios with limited computational resources. In this paper, we compare LL-HMC against five last-layer probabilistic deep learning (LL-PDL) methods across three real-world video datasets for driver action and intention recognition. We evaluate the in-distribution classification performance, calibration, and out-of-distribution (OOD) detection. Due to the stochastic nature of the probabilistic evaluations, we performed five grid searches with different random seeds to avoid relying on a single initialization for the hyperparameter configurations. The results show that LL-HMC achieves competitive in-distribution classification and OOD detection performance. Additional sampled last-layer parameters do not improve the classification performance but can improve the OOD detection. Multiple chains or starting positions did not yield consistent improvements.
- North America > United States (0.14)
- Asia > Middle East > Jordan (0.04)
- Asia > China (0.04)
- Information Technology (0.67)
- Automobiles & Trucks (0.46)
- Transportation (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
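A minimal sketch of the LL-HMC idea: sample only the weights of a (here, logistic) output layer over frozen penultimate features. The toy data, step size, and trajectory length are assumptions for illustration, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def potential_and_grad(w, phi, y, prior_var=1.0):
    """Negative log posterior (and gradient) of a Bayesian logistic last layer."""
    z = phi @ w
    p = 0.5 * (1.0 + np.tanh(0.5 * z))          # numerically stable sigmoid
    nll = np.sum(np.logaddexp(0.0, z) - y * z)  # Bernoulli negative log-likelihood
    U = nll + 0.5 * w @ w / prior_var           # Gaussian prior on the weights
    return U, phi.T @ (p - y) + w / prior_var

def hmc_step(w, phi, y, step=0.01, n_leapfrog=20):
    """One HMC transition over last-layer weights only; features stay frozen."""
    p = rng.normal(size=w.shape)
    U0, grad = potential_and_grad(w, phi, y)
    H0 = U0 + 0.5 * p @ p
    w_new, p = w.copy(), p - 0.5 * step * grad   # first leapfrog half-step
    for i in range(n_leapfrog):
        w_new = w_new + step * p
        U1, grad = potential_and_grad(w_new, phi, y)
        p = p - step * grad * (0.5 if i == n_leapfrog - 1 else 1.0)
    H1 = U1 + 0.5 * p @ p
    accept = np.log(rng.random()) < H0 - H1       # Metropolis correction
    return w_new if accept else w

# Toy features standing in for a frozen penultimate layer.
phi = rng.normal(size=(200, 5))
y = (phi @ rng.normal(size=5) > 0).astype(float)

w, samples = np.zeros(5), []
for _ in range(300):
    w = hmc_step(w, phi, y)
    samples.append(w)
print(np.round(np.mean(samples[100:], axis=0), 2))  # posterior-mean weights
```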
MindEye-OmniAssist: A Gaze-Driven LLM-Enhanced Assistive Robot System for Implicit Intention Recognition and Task Execution
Zhang, Zejia, Yang, Bo, Chen, Xinxing, Shi, Weizhuang, Wang, Haoyuan, Luo, Wei, Huang, Jian
A promising approach to effective human-robot interaction in assistive robotic systems is gaze-based control. However, current gaze-based assistive systems mainly help users with basic grasping actions, offering limited support. Moreover, their restricted intent recognition capability constrains the assistive system's ability to provide diverse assistance functions. In this paper, we propose an open implicit intention recognition framework powered by a Large Language Model (LLM) and a Vision Foundation Model (VFM), which can process gaze input and recognize user intents that are not confined to predefined or specific scenarios. Furthermore, we implement a gaze-driven LLM-enhanced assistive robot system (MindEye-OmniAssist) that recognizes the user's intentions through gaze and assists in completing tasks. To achieve this, the system utilizes an open-vocabulary object detector, an intention recognition network, and an LLM to infer the user's full intentions. By integrating eye-movement feedback and the LLM, it generates action sequences to assist the user in completing tasks. Real-world experiments have been conducted for assistive tasks, and the system achieved an overall success rate of 41/55 across various undefined tasks. Preliminary results show that the proposed method has the potential to provide a more user-friendly human-computer interaction interface and to significantly enhance the versatility and effectiveness of assistive systems by supporting more complex and diverse tasks.
- Asia > China > Hubei Province > Wuhan (0.04)
- Asia > China > Chongqing Province > Chongqing (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
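The gaze-to-LLM pipeline can be sketched as: detect objects, resolve the gaze point to an object, and prompt an LLM for an action sequence. Every interface below (Detection, gaze_target, the prompt format) is hypothetical, not the paper's actual API; the real system uses an open-vocabulary detector and a dedicated intention recognition network:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    center: tuple  # (x, y) pixel coordinates from the object detector

def gaze_target(detections, gaze_xy):
    """Resolve the gaze point to the detection whose center lies closest to it."""
    return min(detections, key=lambda d: (d.center[0] - gaze_xy[0]) ** 2
                                         + (d.center[1] - gaze_xy[1]) ** 2)

def build_prompt(target, scene_labels):
    return (f"The user is looking at: {target.label}. "
            f"Scene contains: {', '.join(scene_labels)}. "
            "Infer the likely intention and output a robot action sequence, "
            "one action per line.")

detections = [Detection("cup", (320, 240)), Detection("kettle", (500, 200))]
target = gaze_target(detections, gaze_xy=(335, 250))
print(build_prompt(target, [d.label for d in detections]))
# The prompt would be sent to the LLM; the returned action lines would then be
# parsed and dispatched to the robot.
```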
Intention Recognition in Real-Time Interactive Navigation Maps
Zhao, Peijie, Arefin, Zunayed, Meneguzzi, Felipe, Pereira, Ramon Fraga
In this demonstration, we develop IntentRec4Maps, a system to recognise users' intentions in interactive maps for real-world navigation. IntentRec4Maps uses the Google Maps Platform as the real-world interactive map, and a very effective approach for recognising users' intentions in real-time. We showcase the recognition process of IntentRec4Maps using two different Path-Planners and a Large Language Model (LLM). GitHub: https://github.com/PeijieZ/IntentRec4Maps
- South America > Brazil (0.15)
- Europe > United Kingdom > England > Greater Manchester > Manchester (0.05)
- Asia > China > Hong Kong (0.05)
- Europe > United Kingdom > England > Greater London > London > Kensington and Chelsea (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
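One common way to recognize navigation intentions with path planners, in the spirit of cost-based goal recognition, is to compare, for each candidate destination, the cost of the best route overall against the best route consistent with the movement observed so far. The destinations, costs, and temperature below are illustrative assumptions, not IntentRec4Maps internals:

```python
import numpy as np

def goal_posterior(cost_with_obs, cost_optimal, beta=1.0):
    """Cost-difference goal recognition: destinations whose best route barely
    deviates from the observed partial route get higher probability."""
    delta = np.array(cost_with_obs) - np.array(cost_optimal)
    w = np.exp(-beta * delta)
    return w / w.sum()

goals = ["station", "museum", "cafe"]
cost_optimal = [10.0, 12.0, 8.0]    # planner cost of the best route to each goal
cost_with_obs = [10.5, 18.0, 14.0]  # best route also covering the observed prefix
print(dict(zip(goals, np.round(goal_posterior(cost_with_obs, cost_optimal), 3))))
# -> the station dominates: the observed movement costs it almost nothing extra
```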
Fitting Different Interactive Information: Joint Classification of Emotion and Intention
Li, Xinger, Zhong, Zhiqiang, Huang, Bo, Yang, Yang
This paper presents the first-place solution for ICASSP MEIJU@2025 Track I, which focuses on low-resource multimodal emotion and intention recognition. Two points are key to the competition: how to effectively utilize a large amount of unlabeled data, and how to ensure that tasks of different difficulty levels promote each other in the interaction stage. In this paper, pseudo-labels are generated by a model trained on the labeled data, and high-confidence samples, together with their labels, are selected to alleviate the low-resource problem. At the same time, the experimentally observed property that intention is comparatively easy to represent is exploited so that intention recognition and emotion recognition mutually promote each other under different attention heads, and higher intention recognition performance is achieved through fusion. Finally, with the refined data processing, we achieve a score of 0.5532 on the test set and win the championship of the track.
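The confidence-based pseudo-label selection described above reduces, at its core, to thresholding the model's predicted probabilities on unlabeled data. A minimal sketch (the threshold value is an assumption; the paper does not state its cutoff here):

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.90):
    """Keep unlabeled samples whose top predicted probability clears the
    threshold; return their indices and hard pseudo-labels."""
    conf = probs.max(axis=1)
    keep = conf >= threshold
    return np.nonzero(keep)[0], probs[keep].argmax(axis=1)

probs = np.array([[0.97, 0.02, 0.01],   # confident  -> kept
                  [0.40, 0.35, 0.25],   # ambiguous  -> discarded
                  [0.05, 0.94, 0.01]])  # confident  -> kept
idx, labels = select_pseudo_labels(probs)
print(idx, labels)  # -> [0 2] [0 1]
```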
Towards Intention Recognition for Robotic Assistants Through Online POMDP Planning
Saborío, Juan Carlos, Hertzberg, Joachim
Intention recognition, or the ability to anticipate the actions of another agent, plays a vital role in the design and development of automated assistants that can support humans in their daily tasks. In particular, industrial settings pose interesting challenges that include potential distractions for a decision-maker as well as noisy or incomplete observations. In such a setting, a robotic assistant tasked with helping and supporting a human worker must interleave information gathering actions with proactive tasks of its own, an approach that has been referred to as active goal recognition. In this paper we describe a partially observable model for online intention recognition, show some preliminary experimental results and discuss some of the challenges present in this family of problems.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (0.94)
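The interleaving of information gathering with proactive action can be caricatured by a toy decision rule over the current belief: observe while the intention is ambiguous, commit once one hypothesis dominates. A real POMDP policy would come from an online solver rather than a fixed threshold; the threshold here is an assumption:

```python
import numpy as np

def entropy(b):
    b = b[b > 0]
    return float(-(b * np.log(b)).sum())

def choose_action(belief, act_threshold=0.8):
    """Gather information while the intention is ambiguous; commit to a
    proactive assistive task once one hypothesis dominates."""
    if belief.max() >= act_threshold:
        return f"assist_with_intent_{int(belief.argmax())}"
    return "observe"  # information-gathering action

belief = np.array([0.5, 0.3, 0.2])
print(choose_action(belief), f"(entropy={entropy(belief):.2f})")  # -> observe
```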
Learning Multimodal Confidence for Intention Recognition in Human-Robot Interaction
Zhao, Xiyuan, Li, Huijun, Miao, Tianyuan, Zhu, Xianyi, Wei, Zhikai, Song, Aiguo
The rapid development of collaborative robotics has opened a new possibility for helping the elderly who have difficulties in daily life, allowing robots to operate according to specific intentions. However, efficient human-robot cooperation requires natural, accurate, and reliable intention recognition in shared environments. The paramount challenge here is reducing the uncertainty of the fused multimodal intention estimate and adaptively reasoning toward a more reliable result under the current interaction conditions. In this work we propose a novel learning-based multimodal fusion framework, Batch Multimodal Confidence Learning for Opinion Pool (BMCLOP). Our approach combines a Bayesian multimodal fusion method with a batch confidence learning algorithm to improve accuracy, uncertainty reduction, and success rate under the given interaction conditions. Moreover, this generic and practical multimodal intention recognition framework can easily be extended further. Our target assistive scenarios consider three modalities, gestures, speech, and gaze, each of which produces a categorical distribution over the finite set of intentions. The proposed method is validated with a six-DoF robot through extensive experiments and exhibits high performance compared to baselines.
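An opinion pool over per-modality categorical distributions can be sketched as a log-linear (weighted geometric) fusion, with each modality's weight playing the role of a learned confidence. The fixed weights and distributions below are illustrative assumptions; BMCLOP learns the confidences in batches:

```python
import numpy as np

def weighted_opinion_pool(modal_dists, confidences):
    """Log-linear opinion pool: fuse per-modality categorical distributions,
    scaling each modality's influence by its confidence weight."""
    log_fused = sum(w * np.log(p + 1e-12) for p, w in zip(modal_dists, confidences))
    fused = np.exp(log_fused - log_fused.max())  # subtract max for stability
    return fused / fused.sum()

gesture = np.array([0.6, 0.3, 0.1])
speech  = np.array([0.2, 0.7, 0.1])
gaze    = np.array([0.5, 0.4, 0.1])
print(np.round(weighted_opinion_pool([gesture, speech, gaze], [0.9, 0.5, 0.7]), 3))
```

A low-confidence modality (speech here, weighted 0.5) pulls the fused distribution less than the high-confidence ones, which is exactly the behavior a learned confidence is meant to produce.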
Intelligent Mode-switching Framework for Teleoperation
Kizilkaya, Burak, She, Changyang, Zhao, Guodong, Imran, Muhammad Ali
Teleoperation can be very difficult due to limited perception, high communication latency, and limited degrees of freedom (DoFs) at the operator side. Autonomous teleoperation is proposed to overcome this difficulty by predicting user intentions and performing some parts of the task autonomously to decrease the demand on the operator and increase the task completion rate. However, decision-making for mode-switching is generally assumed to be done by the operator, which brings an extra DoF to be controlled by the operator and introduces extra mental demand. On the other hand, the communication perspective is not investigated in the current literature, although communication imperfections and resource limitations are the main bottlenecks for teleoperation. In this study, we propose an intelligent mode-switching framework by jointly considering mode-switching and communication systems. User intention recognition is done at the operator side. Based on user intention recognition, a deep reinforcement learning (DRL) agent is trained and deployed at the operator side to seamlessly switch between autonomous and teleoperation modes. A real-world data set is collected from our teleoperation testbed to train both user intention recognition and DRL algorithms. Our results show that the proposed framework can achieve up to 50% communication load reduction with improved task completion probability.
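A heavily simplified stand-in for the DRL mode-switching agent: a tabular, bandit-style learner mapping discretized (intention confidence, network delay) states to a teleoperation/autonomy decision. The toy reward is an assumption; the paper trains a full DRL agent on real testbed data:

```python
import numpy as np

rng = np.random.default_rng(1)
# States: discretized (intention confidence, network delay); actions: 0 = teleop, 1 = auto.
Q = np.zeros((3, 3, 2))

def toy_reward(conf, delay, action):
    """Stand-in for the testbed: autonomy pays off when intention confidence and
    delay are high; manual control is preferable when both are low."""
    return (conf + delay - 2.0) if action == 1 else (2.0 - delay) * 0.5

for _ in range(5000):
    c, d = rng.integers(3), rng.integers(3)
    a = rng.integers(2) if rng.random() < 0.1 else int(Q[c, d].argmax())
    Q[c, d, a] += 0.1 * (toy_reward(c, d, a) - Q[c, d, a])  # bandit-style update
print(Q.argmax(axis=2))  # learned mode for each (confidence, delay) cell
```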
Designing deep neural networks for driver intention recognition
Vellenga, Koen, Steinhauer, H. Joe, Karlsson, Alexander, Falkman, Göran, Rhodin, Asli, Koppisetty, Ashok
Driver intention recognition studies increasingly rely on deep neural networks. Deep neural networks have achieved top performance on many different tasks, but it is not common practice to explicitly analyse the complexity and performance of the network's architecture. Therefore, this paper applies neural architecture search to investigate the effects of the deep neural network architecture on a real-world safety-critical application with limited computational capabilities. We explore a pre-defined search space for three deep neural network layer types capable of handling sequential data (a long short-term memory, a temporal convolution, and a time-series transformer layer), and the influence of different data fusion strategies on driver intention recognition performance. A set of eight search strategies is evaluated on two driver intention recognition datasets. For the two datasets, we observed that no search strategy clearly samples better deep neural network architectures. However, performing an architecture search does improve the model performance compared to the original manually designed networks. Furthermore, we observe no relation between increased model complexity and higher driver intention recognition performance. The results indicate that multiple architectures yield similar performance, regardless of the deep neural network layer type or fusion strategy.
- Europe > Sweden (0.04)
- Oceania > Australia (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Transportation (0.70)
- Automobiles & Trucks (0.68)
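The search procedure can be sketched as sampling configurations from a space over layer type, width, depth, and fusion strategy, then keeping the best-scoring one. The space values and the pseudo-scoring below are assumptions standing in for actual training runs, and plain random search stands in for whichever of the eight strategies the paper evaluates:

```python
import random

rng = random.Random(0)

# Illustrative search space: the three sequential layer types discussed in the
# paper plus width, depth, and fusion strategy. The value grids are assumptions.
search_space = {
    "layer": ["lstm", "tcn", "transformer"],
    "hidden_units": [32, 64, 128],
    "depth": [1, 2, 3],
    "fusion": ["early", "late"],
}

def evaluate(cfg):
    """Placeholder for training the candidate and returning validation accuracy;
    here, a deterministic pseudo-score keyed on the configuration."""
    return random.Random(str(sorted(cfg.items()))).uniform(0.6, 0.9)

candidates = [{k: rng.choice(v) for k, v in search_space.items()} for _ in range(50)]
best = max(candidates, key=evaluate)
print(best, round(evaluate(best), 3))
```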